Skip to content

Conversation

@EwanC
Copy link
Contributor

@EwanC EwanC commented Mar 26, 2025

To support the SYCL-Graph extension on an OpenCL backend, we currently only require the presence of the cl_khr_command_buffer extension. This PR introduces an extra requirement on the CL_COMMAND_BUFFER_SIMULTANEOUS_USE_KHR capability being present.

This is based on the graph execution wording on the definition of handler::ext_oneapi_graph() that:

Only one instance of graph will execute at any time. If graph is submitted multiple times, dependencies are automatically added by the runtime to prevent concurrent executions of an identical graph.

Such usage results in multiple calls by the SYCL runtime to urEnqueueCommandBufferExp with the same UR command-buffer and event dependencies to prevent concurrent execution. Without support for simultaneous-use the OpenCL adapter code cannot guarantee that the first command-buffer submission has finished execution before it makes following clEnqueueCommandBufferKHR calls with the cl_event decencies. If the first submission is still executing, then an error will be reported.

Workarounds like adding blocking host waits to the OpenCL UR adapter are possible, but requiring simultaneous use reflects the vendor requirements as they are for the currently implementation. I've tried to document this all in the UR spec and SYCL-Graph design docs, which also includes a couple of cleanups I found along the way.

Note that the new CTS test fails for Level-Zero adapter, which I've created #17734 to resolve.

@EwanC EwanC force-pushed the ewan/graph_ocl_sim-use branch from 5a7ffa6 to 858ff7e Compare March 26, 2025 14:49
@EwanC EwanC temporarily deployed to WindowsCILock March 26, 2025 14:50 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock March 26, 2025 15:44 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock March 26, 2025 15:44 — with GitHub Actions Inactive
@EwanC EwanC force-pushed the ewan/graph_ocl_sim-use branch from 858ff7e to f9b1937 Compare March 28, 2025 13:03
@EwanC EwanC temporarily deployed to WindowsCILock March 28, 2025 13:03 — with GitHub Actions Inactive
@EwanC EwanC marked this pull request as ready for review March 28, 2025 13:04
@EwanC EwanC requested review from a team as code owners March 28, 2025 13:04
@EwanC EwanC requested a review from keyradical March 28, 2025 13:04
@EwanC EwanC temporarily deployed to WindowsCILock March 28, 2025 13:23 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock March 28, 2025 13:23 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock March 28, 2025 15:14 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock March 28, 2025 16:50 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock March 28, 2025 16:50 — with GitHub Actions Inactive
@EwanC EwanC force-pushed the ewan/graph_ocl_sim-use branch from c65615f to 6fb031f Compare March 28, 2025 21:11
@EwanC EwanC temporarily deployed to WindowsCILock March 28, 2025 21:11 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock March 28, 2025 21:26 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock March 28, 2025 21:26 — with GitHub Actions Inactive
@EwanC EwanC force-pushed the ewan/graph_ocl_sim-use branch from 6fb031f to d610ee8 Compare March 31, 2025 07:57
@EwanC EwanC temporarily deployed to WindowsCILock March 31, 2025 08:42 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock March 31, 2025 09:17 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock March 31, 2025 09:17 — with GitHub Actions Inactive
@EwanC EwanC force-pushed the ewan/graph_ocl_sim-use branch from d610ee8 to 19a961e Compare March 31, 2025 14:12
@EwanC EwanC temporarily deployed to WindowsCILock March 31, 2025 14:12 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock March 31, 2025 14:26 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock March 31, 2025 14:26 — with GitHub Actions Inactive
Copy link
Contributor

@keyradical keyradical left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just small nit.

Ewan Crawford and others added 3 commits April 1, 2025 17:27
To support the SYCL-Graph extension on an OpenCL backend, we currently
only require the presence of the `cl_khr_command_buffer` extension. This
PR introduces an extra requirement on the
[CL_COMMAND_BUFFER_SIMULTANEOUS_USE_KHR](https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_API.html#CL_COMMAND_BUFFER_SIMULTANEOUS_USE_KHR) capability being present.

This is based on the [graph execution
wording](https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/experimental/sycl_ext_oneapi_graph.asciidoc#765-new-handler-member-functions)
on the definition of `handler::ext_oneapi_graph()` that:

>  Only one instance of graph will execute at any time. If graph is submitted multiple times, dependencies are automatically added by the runtime to prevent concurrent executions of an identical graph.

Such usage results in multiple calls by the SYCL runtime to
`urEnqueueCommandBufferExp` with the same UR command-buffer and event
dependencies to prevent concurrent execution. Without support for
simultaneous-use the OpenCL adapter code cannot guarantee that the
first command-buffer submission has finished execution before it
makes following `clEnqueueCommandBufferKHR` calls with the `cl_event`
decencies. If the first submission is still executing, then an error
will be reported.

Workarounds like adding blocking host waits to the OpenCL UR adapter
are possible, but requiring simultaneous use reflects the vendor
requirements as they are for the currently implementation.

I've tried to document this all in the UR spec and SYCL-Graph design
docs, which also includes a couple of cleanups I found along the way.
Taken from intel#17709

Co-authored-by:  Mikołaj Komar <[email protected]>
@EwanC EwanC force-pushed the ewan/graph_ocl_sim-use branch from f879d7b to 7fab3b7 Compare April 1, 2025 16:27
@EwanC EwanC temporarily deployed to WindowsCILock April 1, 2025 16:41 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock April 1, 2025 17:22 — with GitHub Actions Inactive
@EwanC EwanC temporarily deployed to WindowsCILock April 1, 2025 17:22 — with GitHub Actions Inactive
Copy link
Contributor

@Bensuo Bensuo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@EwanC
Copy link
Contributor Author

EwanC commented Apr 3, 2025

@intel/llvm-gatekeepers This is good to merge thanks

@kbenzie kbenzie merged commit 7e7dffc into intel:sycl Apr 3, 2025
95 of 108 checks passed
EwanC pushed a commit to reble/llvm that referenced this pull request Apr 23, 2025
The merged PR intel#17658 added the
requirement for OpenCL cl_khr_command_buffer implementations to support
the simultaneous use capability in order to support UR command-buffers.

There was an oversight in this PR where the simultaneous use property
wasn't set on command-buffer creation in the OpenCL adapter. This is
rectified in this patch.
sommerlukas pushed a commit that referenced this pull request Apr 24, 2025
The merged PR #17658 added the
requirement for OpenCL cl_khr_command_buffer implementations to support
the simultaneous use capability in order to support UR command-buffers.

There was an oversight in this PR where the simultaneous use property
wasn't set on command-buffer creation in the OpenCL adapter. This is
rectified in this patch.
kbenzie pushed a commit to oneapi-src/unified-runtime that referenced this pull request Apr 24, 2025
The merged PR intel/llvm#17658 added the
requirement for OpenCL cl_khr_command_buffer implementations to support
the simultaneous use capability in order to support UR command-buffers.

There was an oversight in this PR where the simultaneous use property
wasn't set on command-buffer creation in the OpenCL adapter. This is
rectified in this patch.
KornevNikita pushed a commit that referenced this pull request May 27, 2025
The merged PR #17658 added the
requirement for OpenCL cl_khr_command_buffer implementations to support
the simultaneous use capability in order to support UR command-buffers.

There was an oversight in this PR where the simultaneous use property
wasn't set on command-buffer creation in the OpenCL adapter. This is
rectified in this patch.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants